Using machine learning methods to avoid the pitfall of cognates and false friends in Spanish-Portuguese word pairs

Authors

  • Lianet Sepúlveda Torres
  • Sandra M. Aluísio
Abstract

It is widely believed that Spanish speakers learning Portuguese benefit from two facts: about 85% of the Portuguese lexicon has Spanish cognates, and the linguistic structures of the two languages largely coincide. However, these similarities also have a downside for learners, namely the pitfall of false friends, since about 20% of cognates are false. The aim of this article is to automatically identify cognates and false friends between Spanish and Portuguese in order to build dictionaries of these words. One use for these dictionaries is to support scientific writing tools, which can help lower barriers for Spanish speakers writing in Portuguese.

Related resources

Disambiguation of partial cognates

Cognates – words that have similar spelling and meaning in two or more languages – can accelerate vocabulary acquisition and facilitate the reading comprehension task. A student has to pay attention to the pairs of words that look and sound similar but have different meanings – false-friend pairs, and especially to pairs of words that share meanings in some but not all contexts – partial cognat...

Unsupervised Extraction of False Friends from Parallel Bi-Texts Using the Web as a Corpus

False friends are pairs of words in two languages that are perceived as similar, but have different meanings, e.g., Gift in German means poison in English. In this paper, we present several unsupervised algorithms for acquiring such pairs from a sentence-aligned bi-text. First, we try different ways of exploiting simple statistics about monolingual word occurrences and cross-lingual word co-occ...

Cognate or False Friend? Ask the Web!

We propose a novel unsupervised semantic method for distinguishing cognates from false friends. The basic intuition is that if two words are cognates, then most of the words in their respective local contexts should be translations of each other. The idea is formalised using the Web as a corpus, a glossary of known word translations used as cross-linguistic “bridges”, and the vector space model...

Multilingual lexical resources to detect cognates in non-aligned texts

The identification of cognates between two distinct languages has recently started to attract the attention of NLP research, but there has been little research into using semantic evidence to detect cognates. The approach presented in this paper aims to detect English-French cognates within monolingual texts (texts that are not accompanied by aligned translated equivalents), by integrating word...

Exploring cognition processes in second language acquisition: the case of cognates and false-friends in EST

This article explores one aspect of the processing perspective in L2 learning in an EST context: the processing of new English content words of the types ‘cognates’ and ‘false friends’ by Spanish-speaking engineering students. The paper does not try to offer a comprehensive overview of language acquisition mechanisms, but rather it is intended to review more narrowly how our conceptual sys...


Publication date: 2011